{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# TensorFlow Script Mode - Using Shell scripts\n",
    "\n",
    "Starting from TensorFlow version 1.11, you can use a shell script as\n",
    "your training entry point. Shell scripts are useful for many use cases including:\n",
    "\n",
    "- Invoking Python scripts with specific parameters\n",
    "- Configuring framework dependencies\n",
    "- Training using different programming languages\n",
    "\n",
    "For this example, we use [a Keras implementation of the Deep Dream algorithm](https://github.com/keras-team/keras/blob/2.2.4/examples/deep_dream.py). We can use the same technique for other scripts or repositories including [TensorFlow Model Zoo](https://github.com/tensorflow/models) and [TensorFlow benchmark scripts](https://github.com/tensorflow/benchmarks/tree/master/scripts/tf_cnn_benchmarks)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Getting the image for training\n",
    "For training data, let's download a public domain image:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "data_dir = os.path.join(os.getcwd(), 'training')\n",
    "\n",
    "os.makedirs(data_dir, exist_ok=True)\n",
    "data_dir"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!wget -O training/dark-forest-landscape.jpg https://www.goodfreephotos.com/albums/other-landscapes/dark-forest-landscape.jpg"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.display import Image\n",
    "Image(filename='training/dark-forest-landscape.jpg') "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Download the training script\n",
    "\n",
    "Let's start by downloading the [deep_dream](https://github.com/keras-team/keras/blob/2.2.4/examples/deep_dream.py) example script from Keras repository. This script takes an image and uses deep dream algorithm to generate\n",
    "transformations of that image."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!wget https://raw.githubusercontent.com/keras-team/keras/2.2.4/examples/deep_dream.py"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The script **deep_dream.py** takes two positional arguments:\n",
    "- `base_image_path`: Path to the image to transform.\n",
    "- `result_prefix`: Prefix of all generated images.\n",
    "\n",
    "### Creating the launcher script\n",
    "\n",
    "We need to create a launcher script that sets the `base_image_path` \n",
    "and `result_prefix`, and invokes **deep_dream.py**:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile launcher.sh \n",
    "\n",
    "BASE_IMAGE_PATH=\"${SM_CHANNEL_TRAINING}/dark-forest-landscape.jpg\"\n",
    "RESULT_PREFIX=\"${SM_MODEL_DIR}/dream\"\n",
    "\n",
    "python deep_dream.py ${BASE_IMAGE_PATH} ${RESULT_PREFIX}\n",
    "\n",
    "echo \"Generated image $(ls ${SM_MODEL_DIR})\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**SM_CHANNEL_TRAINING** and **SM_MODEL** are environment variables created by the SageMaker TensorFlow\n",
    "Container in the beginning of training. Let's take a more detailed look at then: \n",
    "\n",
    "- **SM_MODEL_DIR**: the directory inside the container where the training model data must be saved inside the container, i.e. /opt/ml/model.\n",
    "- **SM_TRAINING_CHANNEL**: the directory containing data in the 'training' channel. \n",
    "\n",
    "For more information about training environment variables, please visit [SageMaker Containers](https://github.com/aws/sagemaker-containers#list-of-provided-environment-variables-by-sagemaker-containers)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Test locally using SageMaker Python SDK TensorFlow Estimator\n",
    "You can use the SageMaker Python SDK TensorFlow estimator to easily train locally and in SageMaker.\n",
    "Let's set **launcher.sh** as the entry-point and **deep_dream.py** as a dependency:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "entry_point='launcher.sh'\n",
    "dependencies=['deep_dream.py']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For more information about the arguments `entry_point` and `dependencies` see the [SageMaker TensorFlow](https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/README.rst#sagemakertensorflowtensorflow-class) documentation.\n",
    "\n",
    "This notebook shows how to use the SageMaker Python SDK to run your code in a local container before deploying to SageMaker's managed training or hosting environments. Just change your estimator's train_instance_type to local or local_gpu. For more information, see: https://github.com/aws/sagemaker-python-sdk#local-mode.\n",
    "\n",
    "In order to use this feature you'll need to install docker-compose (and nvidia-docker if training with a GPU). Running following script will install docker-compose or nvidia-docker-compose and configure the notebook environment for you.\n",
    "\n",
    "Note, you can only run a single local notebook at a time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!/bin/bash ./setup.sh"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's train locally here to make sure everything runs smoothly first."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "train_instance_type='local'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We create the TensorFlow Estimator, passing the flag `script_mode=True`. For more information about script mode, see https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/README.rst#preparing-a-script-mode-training-script:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sagemaker\n",
    "from sagemaker.tensorflow import TensorFlow\n",
    "\n",
    "estimator = TensorFlow(entry_point=entry_point,\n",
    "                       dependencies=dependencies,\n",
    "                       train_instance_type='local',\n",
    "                       train_instance_count=1,\n",
    "                       role=sagemaker.get_execution_role(),\n",
    "                       framework_version='1.13',\n",
    "                       py_version='py3',\n",
    "                       script_mode=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To start a training job, we call `estimator.fit(inputs)`, where inputs is a dictionary where the keys, named **channels**, have values pointing to the data location:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "inputs = {'training': f'file://{data_dir}'}\n",
    "\n",
    "estimator.fit(inputs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`estimator.model_data` contains the S3 location where the contents of **/opt/ml/model**\n",
    "were save as tar.gz file. Let's untar and download the model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!aws s3 cp {estimator.model_data} model.tar.gz\n",
    "!tar -xvzf model.tar.gz"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can see the resulting image now:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.display import Image\n",
    "Image(filename='dream.png')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Training in SageMaker"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After you test the training job locally, upload the dataset to an S3 bucket so SageMaker can access the data during training:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sagemaker\n",
    "\n",
    "training_data = sagemaker.Session().upload_data(path='training', key_prefix='datasets/deep-dream')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `upload_data` call above returns an S3 location that can be used during the SageMaker Training Job"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "training_data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To train in SageMaker:\n",
    "- change the estimator argument **train_instance_type** to any SageMaker ML Instance Type available for training.\n",
    "- set the **training** channel to a S3 location."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "estimator = TensorFlow(entry_point='launcher.sh',\n",
    "                       dependencies=['deep_dream.py'],\n",
    "                       train_instance_type='ml.c4.xlarge',\n",
    "                       train_instance_count=1,\n",
    "                       role=sagemaker.get_execution_role(),\n",
    "                       framework_version='1.13',\n",
    "                       py_version='py3',\n",
    "                       script_mode=True)\n",
    "\n",
    "estimator.fit(training_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `estimator.fit` call bellow starts training and creates a data channel named `training` with the contents of the\n",
    " S3 location `training_data`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "estimator.fit(training_data)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_tensorflow_p36",
   "language": "python",
   "name": "conda_tensorflow_p36"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
